Synthetic Estimation Process for Local Data


The BRFSS is an ongoing telephone survey consisting of interviews conducted each month. In 2021, the sample dataset includes 6,419 surveys divided among eight Pennsylvania health regions: Northwest, Southwest (excluding Allegheny County), Northcentral, Southcentral, Northeast, Southeast (excluding Philadelphia County), Allegheny County and Philadelphia County.

On the state level, data from the BRFSS serve several purposes. BRFSS data help to identify subgroups that should be targeted for health promotion and disease prevention programs because of elevated risks. Multiple years of BRFSS data are useful for tracking Pennsylvania's progress in achieving selected Healthy People 2030 National Health Objectives. Data from Pennsylvania, when compared to similar data from other states, help to identify where increased health promotion and disease prevention efforts are needed. In 2021, comparable data were available from all 50 states, the District of Columbia, Puerto Rico, the U.S. Virgin Islands and Guam.

On the local level, BRFSS data may also be used to estimate the prevalence of risks in local areas, such as counties, if the data are combined for several years or the counties or county groups of interest are oversampled. However, for most counties, the number of respondents in the BRFSS sample data set is insufficient to produce reliable estimates.

In cases where local data on behavioral risk are not available, synthetic estimates can be calculated based on either national data or statewide data from the BRFSS. Synthetic estimates are calculated using population estimates for subgroups of interest and the state or national risk factor prevalence rates for those groups. Below is an example of how one can calculate synthetic estimates for a local area:

Step 1
Obtain the population estimates for the local geographic area of interest. Aggregate the population estimates into a table with the same subgroup breakdown as the national or state prevalence estimates (see the table below).

Step 2
To estimate the number of persons who have the behavioral risk in each subgroup, multiply the subgroup-specific prevalence rate by the population estimate for that subgroup. For example, multiply the 2020 (latest available) Dauphin County census population of 42,156 for ages 18-29 by the 2021 state-level fair or poor health prevalence of 10 percent (0.10) for that age group. The resulting 2021 synthetic estimate of Dauphin County adults ages 18-29 in fair or poor health is 4,216 (42,156 x 0.10 = 4,215.6, rounded to the nearest whole number).
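
The multiplication in Step 2 can be sketched in Python. This is only an illustration, not part of the BRFSS methodology itself: the function name subgroup_estimate is hypothetical, and rounding to the nearest whole person is an assumption based on the whole-number figures in the worked example.

```python
def subgroup_estimate(population: int, prevalence_pct: float) -> int:
    """Estimated number of people with the risk in one subgroup.

    population     -- local population estimate for the subgroup
    prevalence_pct -- state or national prevalence for the subgroup, in percent
    """
    # Convert the percentage to a proportion, multiply by the population,
    # and round to whole persons (assumed rounding rule).
    return round(population * prevalence_pct / 100)


# Dauphin County, ages 18-29: population 42,156, prevalence 10 percent.
print(subgroup_estimate(42156, 10))  # 4216
```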

Step 3
To obtain the total number of persons who indicated fair or poor health, repeat Step 2 for all subgroups and then sum the subgroup estimates to get a total estimate.

Age Group  2020 Dauphin County       Fair or Poor Health       Estimate of Dauphin County Adults
           Census Population         from 2021 Pa. BRFSS       Indicating Fair or Poor Health, 2020
18-29       42,156              x    10%                  =     4,216
30-44       51,157              x     8%                  =     4,093
45-64       74,527              x    19%                  =    14,160
65+         46,922              x    26%                  =    12,200
Total      214,762                                             34,669
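
As a cross-check, the table can be reproduced with a short Python sketch. The populations and prevalence rates are the ones listed above. The names AGE_GROUPS and synthetic_total are hypothetical, and rounding each subgroup to a whole person before summing is an assumption chosen to match the table.

```python
# (age group, 2020 Dauphin County population, 2021 Pa. BRFSS prevalence in percent)
AGE_GROUPS = [
    ("18-29", 42156, 10),
    ("30-44", 51157, 8),
    ("45-64", 74527, 19),
    ("65+", 46922, 26),
]


def synthetic_total(groups):
    """Return (per-group estimates, total estimate, total population)."""
    # Step 2: subgroup population x subgroup prevalence, rounded to whole persons.
    estimates = {age: round(pop * pct / 100) for age, pop, pct in groups}
    # Step 3: sum the rounded subgroup estimates.
    total_estimate = sum(estimates.values())
    total_population = sum(pop for _, pop, _ in groups)
    return estimates, total_estimate, total_population


estimates, total_estimate, total_population = synthetic_total(AGE_GROUPS)
for age, count in estimates.items():
    print(f"{age:>5}  {count:>7,}")
print(f"Total  {total_estimate:>7,} of {total_population:,} adults")
```

Rounding each subgroup before summing reproduces the table's total of 34,669. Summing the unrounded products and rounding once at the end would give 34,668, so the order of rounding should be stated when estimates are reported.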

Step 4
To calculate the synthetically estimated percentage of Dauphin County adults with fair or poor health, take the total estimated number of adults with fair or poor health from Step 3 and the total population age 18 and older in Dauphin County (the sum of the age-group populations in the table).

Total Synthetically Estimated Number of Adults with Fair or Poor Health in Dauphin County = 34,669
Total Population Age 18+ in Dauphin County = 214,762

Divide the synthetically estimated number of adults with fair or poor health by the adult population. Then multiply by 100 so that the result will be expressed as a percentage.

Synthetically Estimated Percentage with Fair or Poor Health in Dauphin County = (34,669 / 214,762) x 100
Synthetically Estimated Percentage with Fair or Poor Health in Dauphin County = 16.1 percent

The result is the synthetically estimated percentage of Dauphin County adults with fair or poor health.
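
The division in Step 4 can be sketched the same way. The function name synthetic_percentage is hypothetical, and the input check is an added safeguard rather than part of the published procedure.

```python
def synthetic_percentage(estimated_count: int, adult_population: int) -> float:
    """Synthetic estimate expressed as a percentage of the adult population."""
    if adult_population <= 0:
        raise ValueError("adult_population must be positive")
    # Divide the estimated count by the adult population, then scale to percent.
    return estimated_count / adult_population * 100


# Totals from Step 3: 34,669 estimated adults out of 214,762 adults age 18+.
pct = synthetic_percentage(34669, 214762)
print(f"{pct:.1f} percent")  # 16.1 percent
```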

Caution: Synthetic estimates can be useful for planning purposes. However, these estimates should not be used if there is reason to believe that local rates for subgroups of interest would diverge widely from the state or national rates. The prevalence of most health-related conditions varies considerably with age and often with other factors, such as sex, race and income. A more precise estimate may be obtained using age-, sex- and race-specific prevalence rates. However, the Pennsylvania BRFSS is not a reliable source of prevalence rates specific to age-sex-race categories; for estimates at that level of detail, national data would be a more reliable basis for synthetic estimates.

It is important to qualify estimates whenever they are used. A clear citation of the sources of the data used to calculate the local area synthetic estimates should be included in every report of the estimates.